Skip to content

#701 edit configure-yaml tool for yaml changes#802

Merged
singhd789 merged 12 commits intoNOAA-GFDL:701.refinediagfrom
singhd789:refinediag-preanalysis
Mar 30, 2026
Merged

#701 edit configure-yaml tool for yaml changes#802
singhd789 merged 12 commits intoNOAA-GFDL:701.refinediagfrom
singhd789:refinediag-preanalysis

Conversation

@singhd789
Copy link
Copy Markdown
Contributor

Describe your changes

Issue ticket number and link (if applicable)

Checklist before requesting a review

  • I ran my code
  • I tried to make my code readable
  • I tried to comment my code
  • I wrote a new test, if applicable
  • I wrote new instructions/documentation, if applicable
  • I ran pytest and inspected it's output
  • I ran pylint and attempted to implement some of it's feedback
  • No print statements; all user-facing info uses logging module

Note: If you are a code maintainer updating the tag or releasing a new fre-cli version, please use the release_procedure.md template. To quickly use this template, open a new pull request, choose your branch, and add ?template=release_procedure.md to the end of the url.

rose_suite.set( keys = ['template variables', 'DO_PREANALYSIS'],
value = do_preanalysis )
rose_suite.set( keys = ['template variables', 'PREANALYSIS_SCRIPT'],
value = quote_rose_values(script) )
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't believe how efficient you are, Dana, and thank you for continuing this.

One questionable aspect of how we set this up in the flow.cylc was having only one preanalysis script. (We allowed multiple refinediag scripts with the jinja wizardry from @meteorologist15 )

So if the user has multiple preanalysis scripts, we should give the bad news ("Known deficiency that will be fixed in 2026.02").

In our first yaml attempts, we used lists when dictionaries would suffice because that's how the XML looked. But dictionaries are more attractive visually, more pythonic, and supported by uw compose. So that's why I thought to make the preanalysis and refinediag sections dictionaries.

But the dictionary key is the unique user-specified label. It could be vitals but it could be something else. Both of these should be accepted and equivelant.

  preanalysis:
    vitals:
      script: "/home/mdteam/DET/analysis/vitals/gfdlvitals_refineDiag_om5.latest.csh"
      inputs: ['atmos_month', 'aerosol_month']
      do_preanalysis: True
  preanalysis:
    my_db_script:
      script: "/home/mdteam/DET/analysis/vitals/gfdlvitals_refineDiag_om5.latest.csh"
      inputs: ['atmos_month', 'aerosol_month']
      do_preanalysis: True

Copy link
Copy Markdown
Contributor Author

@singhd789 singhd789 Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of validation was something I asked Paul and Christina about for uw tools (because we would be defining arbitrary components for each experiment)!

The verdict: We can have both arbitrary key names (JSON Schema refers to these as "additional properties", i.e. property (key) names that are not explicitly declared) and on top of that, we can still define requirements for their contents.

Copy link
Copy Markdown
Contributor Author

@singhd789 singhd789 Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't explore uw config validate too too deeply, but if this is how they validate, it seems we could definitely use that as well. It seems to relate to our discussions for where the schema could live and how we could potentially define it in the yaml or something. With uw config validate it looks like you pass in both the input path and schema path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now (before uw tools work), it looks like we could use additional_properties in the schema and then define the content requirements

Copy link
Copy Markdown
Contributor Author

@singhd789 singhd789 Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't believe how efficient you are, Dana, and thank you for continuing this.

One questionable aspect of how we set this up in the flow.cylc was having only one preanalysis script. (We allowed multiple refinediag scripts with the jinja wizardry from @meteorologist15 )

So if the user has multiple preanalysis scripts, we should give the bad news ("Known deficiency that will be fixed in 2026.02").

In our first yaml attempts, we used lists when dictionaries would suffice because that's how the XML looked. But dictionaries are more attractive visually, more pythonic, and supported by uw compose. So that's why I thought to make the preanalysis and refinediag sections dictionaries.

But the dictionary key is the unique user-specified label. It could be vitals but it could be something else. Both of these should be accepted and equivelant.

  preanalysis:
    vitals:
      script: "/home/mdteam/DET/analysis/vitals/gfdlvitals_refineDiag_om5.latest.csh"
      inputs: ['atmos_month', 'aerosol_month']
      do_preanalysis: True
  preanalysis:
    my_db_script:
      script: "/home/mdteam/DET/analysis/vitals/gfdlvitals_refineDiag_om5.latest.csh"
      inputs: ['atmos_month', 'aerosol_month']
      do_preanalysis: True

So with the preanalysis scripts, will it be structured the same as refineDiag here eventually? (with multiple scripts, inputs, ... under different keys) or would it be multiple scripts under "vitals"?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, preanalysis scripts are identical to refinediag scripts except they produce no output. So they will have inputs (which are optional for now), but no outputs.

for k2, v2 in v.items():
script = v2["script"]
#                rds.append(scripts)
rds = f"{rds} {script}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks good!

@singhd789 singhd789 marked this pull request as ready for review March 27, 2026 19:08
Copy link
Copy Markdown
Contributor

@ceblanton ceblanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. thanks!

The testing and the >2 preanalysis bug can be addressed after the alpha2 tag.

script: "/path/to/aerosol-script"
input: ['aerosol_month']
output: ['aerosol_refined']
do_refinediag: True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example yaml looks good, but it won't actually work because the scripts aren't there yet.

The test cases can come later we agreed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yeah, just wanted to get the example started

@ilaflott ilaflott force-pushed the refinediag-preanalysis branch from ec79e4b to 4993d20 Compare March 30, 2026 15:33
@singhd789 singhd789 merged commit 7c21d85 into NOAA-GFDL:701.refinediag Mar 30, 2026
1 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants